Higher-Order Predictive Information for Learning an Infinite Stream of Episodes

Author

  • Byoung-Tak Zhang
Abstract

We consider the problem of lifelong learning from an indefinite stream of temporal episodes, i.e. a time series consisting of episodes, where the number of episodes is potentially infinite and the length of each episode varies. Examples of this class of learning include a humanoid robot that continually learns to imitate various human behaviors [7], a computer music system that learns to compose from a continuous stream of music pieces [4], and a cognitive system that incrementally learns visual concepts from a series of movies over a long period of time [9]. What kind of objective function should the lifelong learner use to balance short-term and long-term performance? How should the learner optimize its model complexity when the statistics of the episodes change over time? Maximization of the expected future reward, such as the value function used in reinforcement learning, might be useful if we could define rewards for a prespecified goal. For learning an indefinite stream of episodes, we find mutual-information-based measures from information theory suitable, such as predictive information [2], $I(X_{\text{future}}; X_{\text{past}})$, and empowerment [3], $\max_A I(X_{t+1}; A_t \mid x_t)$. The predictive information is, however, typically approximated by restricting the time horizons to a single time step, i.e. $I(X_{\text{future}}; X_{\text{past}}) = I(X_{t+1}; X_t)$. Though this is exact under the Markov assumption, i.e. the probability of a state depends only on the previous state, and can still generate explorative behavior [1], the predictive power can be improved by increasing the order of temporal dependency. Here we extend the first-order predictive information to the kth-order predictive information for lifelong learning from a continuous stream of time-series episodes. We generalize the predictive information $I(X_{t+1}; X_t)$ by replacing the first-order term $X_t$ for the past with the kth-order history term $X_{t-k:t} = (X_{t-k}, X_{t-k+1}, \ldots, X_t)$, i.e. $I(X_{t+1}; X_{t-k:t})$.
The generalized, higher-order predictive information is written as:
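As a rough illustration of the quantity I(X_{t+1}; X_{t-k:t}) described above, the following sketch estimates it from a single sequence of discrete symbols. It is not the author's method: it assumes discrete observations and a plug-in (empirical frequency) estimator of mutual information, neither of which is specified in the abstract, and the function name `predictive_information` and its parameter `k` are hypothetical. Following the index notation X_{t-k:t}, the history holds k + 1 symbols, so `k=0` corresponds to I(X_{t+1}; X_t).

```python
from collections import Counter
from math import log2


def predictive_information(seq, k=0):
    """Plug-in estimate, in bits, of I(X_{t+1}; X_{t-k:t}) for a discrete sequence.

    Assumption (not from the source): probabilities are replaced by empirical
    frequencies of (history, next symbol) pairs observed in `seq`.
    """
    width = k + 1  # number of symbols in the history X_{t-k:t}
    # Slide a window over the sequence: history tuple plus the symbol after it.
    pairs = [(tuple(seq[i:i + width]), seq[i + width])
             for i in range(len(seq) - width)]
    n = len(pairs)
    if n == 0:
        raise ValueError("sequence too short for the requested order k")

    joint = Counter(pairs)                   # counts of (history, next)
    hist = Counter(h for h, _ in pairs)      # marginal counts of histories
    nxt = Counter(x for _, x in pairs)       # marginal counts of next symbols

    # Sum over observed pairs of p(h, x) * log2( p(h, x) / (p(h) * p(x)) ).
    return sum((c / n) * log2(c * n / (hist[h] * nxt[x]))
               for (h, x), c in joint.items())


if __name__ == "__main__":
    # Period-4 pattern: the next symbol is ambiguous given one past symbol,
    # but fully determined by the last two symbols.
    seq = [0, 0, 1, 1] * 50
    print(predictive_information(seq, k=0))
    print(predictive_information(seq, k=1))
```

On this toy sequence the estimate with `k=0` is close to 0 bits, while with `k=1` it is close to 1 bit, which matches the abstract's point that a longer history can raise predictive power. Plug-in estimates are biased upward for short sequences and large `k`, so the values should be read as indicative only.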

Publication date: 2012